Symmetric Indefinite Linear Solver using OpenMP Task on Multicore Architectures

Authors

  • Ichitaro Yamazaki
  • Jakub Kurzak
  • Panruo Wu
  • Mawussi Zounon
  • Jack Dongarra
Abstract

Recently, the Open Multi-Processing (OpenMP) standard has incorporated task-based programming, where a function call with input and output data is treated as a task. At run time, OpenMP’s superscalar scheduler tracks the data dependencies among the tasks and executes the tasks as their dependencies are resolved. On a shared-memory architecture with multiple cores, the independent tasks are executed on different cores in parallel, thereby enabling parallel execution of a seemingly sequential code. With the emergence of many-core architectures, this type of programming paradigm is gaining attention—not only because of its simplicity, but also because it breaks the artificial synchronization points of the program and improves its thread-level parallelization. In this paper, we use these new OpenMP features to develop a portable high-performance implementation of a dense symmetric indefinite linear solver. Obtaining high performance from this kind of solver is a challenge because the symmetric pivoting, which is required to maintain numerical stability, leads to data dependencies that prevent us from using some common performance-improving techniques. To fully utilize a large number of cores through tasking, while conforming to the OpenMP standard, we describe several techniques. Our performance results on current many-core architectures—including Intel’s Broadwell, Intel’s Knights Landing, IBM’s Power8, and Arm’s ARMv8—demonstrate the portable and superior performance of our implementation compared with the Linear Algebra PACKage (LAPACK). The resulting solver is now available as a part of the PLASMA software package.
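The task model the abstract describes can be sketched in a few lines of C. This is a minimal illustration, not code from the paper or from PLASMA: the kernel names `fill` and `add_into` and the driver `run_task_graph` are hypothetical. It assumes an OpenMP 4.0 or later compiler, since it uses the `depend` clause with array sections. Two tasks write independent buffers and may run on different cores; a third task reads both and is released only after both writers finish.

```c
#include <stddef.h>

/* Hypothetical kernels standing in for tile operations. */
static void fill(double *tile, int n, double value) {
    for (int i = 0; i < n; i++) tile[i] = value;
}

static void add_into(const double *a, const double *b, double *c, int n) {
    for (int i = 0; i < n; i++) c[i] = a[i] + b[i];
}

/* Runs three tasks whose ordering is derived from their declared
 * inputs and outputs, then returns the sum of the entries of c. */
double run_task_graph(int n, double *a, double *b, double *c) {
    double sum = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        /* These two tasks write disjoint buffers, so they are independent. */
        #pragma omp task depend(out: a[0:n])
        fill(a, n, 1.0);

        #pragma omp task depend(out: b[0:n])
        fill(b, n, 2.0);

        /* This task reads a and b, so it waits for both tasks above. */
        #pragma omp task depend(in: a[0:n], b[0:n]) depend(out: c[0:n])
        add_into(a, b, c, n);

        /* Wait for all child tasks before reading c on this thread. */
        #pragma omp taskwait
        for (int i = 0; i < n; i++) sum += c[i];
    }
    return sum;
}
```

The code reads like a sequential program: if it is compiled without OpenMP support, the pragmas are ignored and the three calls run in program order with the same result. With OpenMP enabled (for example `-fopenmp` with GCC or Clang), the run time builds the dependency graph from the `depend` clauses, and no explicit barrier is needed between the two `fill` calls and `add_into`.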

Similar resources

An efficient distributed randomized solver with application to large dense linear systems

Randomized algorithms are gaining ground in high performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. In this paper, we propose a randomized algorithm for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter es...

An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems

Randomized algorithms are gaining ground in high-performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. We propose a randomized solver for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter estimation problems ...

On the Performance of an Algebraic Multigrid Solver on Multicore Clusters

Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG’s performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread...

Design of a Multicore Sparse Cholesky Factorization Using DAGs

The rapid emergence of multicore machines has led to the need to design new algorithms that are efficient on these architectures. Here, we consider the solution of sparse symmetric positive-definite linear systems by Cholesky factorization. We were motivated by the successful division of the computation in the dense case into tasks on blocks and use of a task manager to exploit all the parallel...

Analyzing Performance and Power of Multicore Architecture Using Multithreaded Iterative Solver

Problem statement: Scientific modeling and simulations have been popularly used with experiments and theoretical analysis in science and engineering communities. Approach: Consequently, computational demands are growing exponentially to afford large scale modeling and simulations. Results: As a result, multicore computing architectures had been proposed and several products are already availabl...

Publication date: 2018